Yuhao Wu

Helen

Diagnosing Harmful Continuation in Answer-Correct Long-CoT Training Traces

May 28, 2026
Via arXiv

GeoFaith: A Spatio-Temporal Dual View of Faithful Chain-of-Thought

May 26, 2026
Via arXiv

Behavioral Integrity Verification for AI Agent Skills

May 12, 2026
Via arXiv

From Perception to Action: An Interactive Benchmark for Vision Reasoning

Feb 24, 2026
Via arXiv

Kimi K2.5: Visual Agentic Intelligence

Feb 02, 2026
Via arXiv

Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities

Jan 29, 2026
Via arXiv

AdvJudge-Zero: Binary Decision Flips in LLM-as-a-Judge via Adversarial Control Tokens

Dec 19, 2025
Via arXiv

NOTAM-Evolve: A Knowledge-Guided Self-Evolving Optimization Framework with LLMs for NOTAM Interpretation

Nov 11, 2025
Via arXiv

Kimi Linear: An Expressive, Efficient Attention Architecture

Oct 30, 2025
Via arXiv

Expertise need not monopolize: Action-Specialized Mixture of Experts for Vision-Language-Action Learning

Oct 16, 2025
Via arXiv